# Load packages
library(bayesrules) 
library(tidyverse) 
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.5     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.4     ✓ stringr 1.4.0
## ✓ readr   2.0.2     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(rstan) 
## Loading required package: StanHeaders
## rstan (Version 2.21.2, GitRev: 2e1f913d3ca3)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
## 
## Attaching package: 'rstan'
## The following object is masked from 'package:tidyr':
## 
##     extract
library(rstanarm) 
## Loading required package: Rcpp
## This is rstanarm version 2.21.1
## - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
## - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
## - For execution on a local, multicore CPU with excess RAM we recommend calling
##   options(mc.cores = parallel::detectCores())
## 
## Attaching package: 'rstanarm'
## The following object is masked from 'package:rstan':
## 
##     loo
library(bayesplot) 
## This is bayesplot version 1.8.1
## - Online documentation and vignettes at mc-stan.org/bayesplot
## - bayesplot theme set to bayesplot::theme_default()
##    * Does _not_ affect other ggplot2 plots
##    * See ?bayesplot_theme_set for details on theme setting
library(tidybayes) 
library(janitor) 
## 
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
library(broom.mixed)

Exercise 9.1

  1. Since a model line can cross anywhere along the y-axis, and the slope of a line can be any positive or negative value (or even 0), the intercept and slope regression parameters, β0 and β1, can technically take any value on the real line. So it’s reasonable to utilize Normal prior models, which also live on the entire real line, for β0 and β1.

  2. Since the standard deviation parameter σ must be positive, it’s unreasonable to utilize a Normal prior model for σ: the Normal is not restricted to positive values but lives on the entire real line.

  3. While both vague priors and weakly informative priors reflect general prior uncertainty about the model parameters, a vague prior might be so vague that it puts weight on non-sensible parameter values. By contrast, weakly informative priors are relatively more focused: they reflect general prior uncertainty across a range of sensible parameter values. Because weakly informative priors are tuned to sensible values by taking into account the scales of the variables of interest, they also foster computationally efficient posterior simulation, since the chains don’t waste time exploring non-sensible parameter values the way they can under vague priors.
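To make the contrast concrete, here is a small sketch (the hyperparameter values are hypothetical, chosen only for illustration) comparing how much mass a vague versus a weakly informative prior places on implausibly extreme slope values:

```r
set.seed(84735)

# A vague prior on a slope, Normal(0, 1000^2), puts most of its mass on
# implausibly extreme slopes (here, |beta_1| > 100)
vague <- rnorm(100000, mean = 0, sd = 1000)
mean(abs(vague) > 100)   # roughly 0.92

# A weakly informative prior tuned to the scale of the data, Normal(0, 5^2),
# keeps essentially all of its mass on sensible values
weak <- rnorm(100000, mean = 0, sd = 5)
mean(abs(weak) > 100)    # essentially 0
```

A chain exploring the vague prior spends most of its time on values we could have ruled out in advance; the weakly informative prior avoids that waste.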

Exercise 9.2

  1. Here the response variable (Y) is a person’s height, and the predictor variable (X) is a person’s arm length.

  2. Here the response variable (Y) is a person’s carbon footprint (in annual CO2 emissions), and the predictor variable (X) is the distance between a person’s home and work.

  3. Here the response variable (Y) is a child’s vocabulary level, and the predictor variable (X) is a child’s age.

  4. Here the response variable (Y) is a person’s reaction time, and the predictor variable (X) is a person’s sleep habits.

Exercise 9.3

  1. β0 indicates the typical height in cm of a baby kangaroo when it is 0 months old (Xi = 0). It provides a baseline for where the prior model lives along the y-axis. β1 indicates the typical change in a baby kangaroo’s height in cm for every one unit (month) increase in age. In this particular model with only one quantitative predictor X, the coefficient β1 is equivalent to a slope. My prior understanding suggests that β1 is positive since normally the height of a baby kangaroo increases with its age (as it grows older).

  2. β0 indicates the typical number of GitHub followers of a data scientist when he/she had 0 GitHub commits in the past week (Xi = 0). It provides a baseline for where the prior model lives along the y-axis. β1 indicates the typical change in a data scientist’s number of GitHub followers for every one unit (count of GitHub commit) increase in his/her GitHub commits in the past week. In this particular model with only one quantitative predictor X, the coefficient β1 is equivalent to a slope. My prior understanding suggests that β1 is positive since more GitHub commits in the past week by a data scientist (actively revising his/her work) would probably raise people’s attention and interest in his/her work, thus bringing more GitHub followers.

  3. β0 indicates the typical number of visitors to a local park on a day of 0 rainfall in inches (Xi = 0). It provides a baseline for where the prior model lives along the y-axis. β1 indicates the typical change in the number of visitors to a local park on a given day for every one unit (inch) increase in that day’s rainfall. In this particular model with only one quantitative predictor X, the coefficient β1 is equivalent to a slope. My prior understanding suggests that β1 is negative since normally the number of visitors to a local park on a day decreases with the rainfall on that day (people tend to be reluctant to go out visiting a park on rainy days).

  4. β0 indicates the typical hours of Netflix that a person watches on a daily basis when the typical number of hours that they sleep is 0 (Xi = 0). It provides a baseline for where the prior model lives along the y-axis. β1 indicates the typical change in hours of Netflix that a person watches on a daily basis for every one unit (hour) increase in the typical number of hours that they sleep. In this particular model with only one quantitative predictor X, the coefficient β1 is equivalent to a slope. My prior understanding suggests that β1 is negative: normally, the more hours a person typically sleeps, the fewer hours they spend watching Netflix each day (and the habit of binge-watching Netflix may cut into a person’s average sleep time).

Exercise 9.4

At any value of the predictor X, the observed values of Y vary normally around their average μ with a common standard deviation σ, which indicates the degree to which observations deviate from the model line. With a small σ, the observed data deviate little from and sit tightly around the mean model line: observed outcomes Y at the same value of X differ little from the average μ, indicating that X is a strong predictor of Y. Similarly, with a larger σ, observed outcomes exhibit greater variability at any given value of X, reflecting a weaker relationship between the variables.
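A quick simulation illustrates this (the line and data below are hypothetical, not from the exercises): the same regression line paired with a small versus a large σ yields a strong versus a weak observed relationship:

```r
set.seed(84735)
x <- runif(500, min = 0, max = 100)

# Same underlying line, mu = 50 + 2x, with two levels of residual variability
y_small_sigma <- 50 + 2*x + rnorm(500, mean = 0, sd = 5)
y_large_sigma <- 50 + 2*x + rnorm(500, mean = 0, sd = 100)

cor(x, y_small_sigma)  # near 1: X is a strong predictor of Y
cor(x, y_large_sigma)  # much weaker: observations stray far from the line
```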

Exercise 9.5

  1. The response variable Y is a person’s annual orange juice consumption (in gallons), and the predictor variable X is a person’s age (in years).

  2. Yi | μ, σ ∼ N(μ, σ^2), where Yi denotes a person’s annual orange juice consumption in gallons, μ denotes the global mean annual orange juice consumption in gallons across people of all ages combined, and σ measures the variability in a person’s annual consumption around the global mean μ across people of all ages.

  3. Yi | β0, β1, σ ∼ N(μi, σ^2) with μi = β0 + β1Xi, where Xi denotes a person’s age in years, Yi denotes a person’s annual orange juice consumption in gallons, μi denotes the local mean annual orange juice consumption at a given age, and σ measures the variability in a person’s annual consumption around the local mean μi among people of similar age.

  4. The unknown parameters in the model are β0, β1, and σ. The intercept coefficient β0 can technically take any value, since it indicates a person’s annual orange juice consumption at age 0 (Xi = 0). It provides a baseline for where our model “lives” along the y-axis; if it falls far outside the norm (e.g. negative or huge), we can later center it to reflect a person’s typical annual consumption at a typical age for easier interpretation. The coefficient β1 can take any value: assuming a linear relationship between Y and X, annual consumption could increase with age (an adult’s appetite is normally larger than a child’s), decrease with age (children may have a particular preference for orange juice compared to adults, who have many more beverage choices), or show no relationship with age at all. σ can take any positive value, since it is a standard deviation parameter and must be positive.

  5. Normal prior models are suitable choices for β0 and β1 (m0, s0, m1, s1 are hyperparameters).

β0∼ N(m0, s0^2)

β1∼ N(m1, s1^2)

Generally the intercept and slope regression parameters, β0 and β1, can technically take any value on the real line, so it’s reasonable to utilize Normal prior models for β0 and β1, which also live on the entire real line. Normal priors are also suitable in this specific case: Yi is measured on a continuous scale, and measurements like consumption volume are often roughly symmetrically distributed around some global average, here μ.

An Exponential prior model is a suitable choice for σ.

σ ∼ Exp(l)

The Exponential model is a special case of the Gamma with shape 1: Exp(l) = Gamma(1, l). Since the standard deviation parameter σ must be positive, it’s reasonable to utilize an Exponential prior, which is also restricted to positive values.
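As a sanity check on these prior choices, we can simulate draws from each (the hyperparameter values below are hypothetical placeholders, not tuned to the orange juice example):

```r
set.seed(84735)

# Draws from candidate priors: Normal for beta_0 and beta_1, Exponential for sigma
beta_0 <- rnorm(10000, mean = 0, sd = 10)
beta_1 <- rnorm(10000, mean = 0, sd = 2)
sigma  <- rexp(10000, rate = 0.5)

range(beta_0)   # spans negative and positive values, as an intercept may
min(sigma) > 0  # TRUE: Exponential draws are always positive
mean(sigma)     # roughly 1/rate = 2, the prior's mean
```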

Exercise 9.6

  1. The predictor variable X is the high temperature (in degrees Fahrenheit) on one day, and the response variable Y is the high temperature (in degrees Fahrenheit) on the next day.

  2. Yi | μ, σ ∼ N(μ, σ^2), where Yi denotes the high temperature in degrees Fahrenheit on the next day, μ denotes the global mean next-day high temperature across all days combined, and σ measures the variability in next-day high temperatures around the global mean μ across all days.

  3. Yi | β0, β1, σ ∼ N(μi, σ^2) with μi = β0 + β1Xi, where Xi denotes the high temperature in degrees Fahrenheit on one day, Yi denotes the high temperature the next day, μi denotes the local mean next-day high temperature following a day with high temperature Xi, and σ measures the variability in next-day highs around the local mean μi among days with similar temperature.

  4. The unknown parameters in the model are β0, β1, and σ. The intercept coefficient β0 can technically take any value, since it indicates tomorrow’s high temperature when today’s high temperature is 0 (Xi = 0). It provides a baseline for where our model “lives” along the y-axis; if it falls far outside the norm for the region (e.g. too low or even negative), we can later center it to reflect the typical next-day high following a day with a typical high temperature. The coefficient β1 can take any positive value: assuming a linear relationship between Y and X, a day with a high temperature will normally be followed by a similarly high temperature the next day, since temperature tends not to change dramatically over two days. σ can take any positive value, since it is a standard deviation parameter and must be positive.

  5. Normal prior models are suitable choices for β0 and β1 (m0, s0, m1, s1 are hyperparameters).

β0∼ N(m0, s0^2)

β1∼ N(m1, s1^2)

Generally the intercept and slope regression parameters, β0 and β1, can technically take any value on the real line, so it’s reasonable to utilize Normal prior models for β0 and β1, which also live on the entire real line. Normal priors are also suitable in this specific case: Yi is measured on a continuous scale, and measurements like daily high temperature are often roughly symmetrically distributed around some global average, here μ.

An Exponential prior model is a suitable choice for σ.

σ ∼ Exp(l)

The Exponential model is a special case of the Gamma with shape 1: Exp(l) = Gamma(1, l). Since the standard deviation parameter σ must be positive, it’s reasonable to utilize an Exponential prior, which is also restricted to positive values.

Exercise 9.7

  1. False

  2. True

Exercise 9.8

  1. bunnies_model <- stan_glm(height ~ age, data = bunnies,
                     family = gaussian, 
                     prior_intercept = normal(14, 3), 
                     prior = normal(3, 1), 
                     prior_aux = exponential(0.5), 
                     chains = 4, iter = 5000*2, seed = 84735)
  2. songs_model <- stan_glm(Clicks ~ Snaps, data = songs,
                     family = gaussian, 
                     prior_intercept = normal(80, 20), 
                     prior = normal(100, 50), 
                     prior_aux = exponential(0.0025), 
                     chains = 4, iter = 5000*2, seed = 84735)
  3. dogs_model <- stan_glm(Happiness ~ Age, data = dogs,
                     family = gaussian, 
                     prior_intercept = normal(6, 2), 
                     prior = normal(2, 1), 
                     prior_aux = exponential(0.25), 
                     chains = 4, iter = 5000*2, seed = 84735)
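One plausible way to read the prior_aux rates above (this tuning rule is an assumption, not stated in the exercises): an Exponential(l) prior has mean 1/l, so a rough prior guess for the typical size of σ translates directly into the rate l = 1/guess:

```r
# Exponential(rate) has mean 1/rate, so a prior guess that sigma is around
# 2 cm (bunnies), 400 clicks (songs), or 4 points (dogs) yields these rates
sigma_guess <- c(bunnies = 2, songs = 400, dogs = 4)
rate <- 1 / sigma_guess
rate  # 0.5, 0.0025, 0.25: matching the exponential() rates used above
```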

Exercise 9.9

  1. data: Yi | β0, β1, σ ∼ N(μi, σ^2) with μi = β0 + β1Xi

priors: β0c ∼ N(5000, 2000^2)

    β1 ∼ N(-10, 5^2)

    σ ∼ Exp(0.0005) 
bike_priors <- stan_glm(
  rides ~ humidity, data = bikes, 
  family = gaussian, 
  prior_intercept = normal(5000, 2000), 
  prior = normal(-10, 5), 
  prior_aux = exponential(0.0005),
  prior_PD = TRUE,
  chains = 5, iter = 4000*2, seed = 2666)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.7e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.337158 seconds (Warm-up)
## Chain 1:                0.061689 seconds (Sampling)
## Chain 1:                0.398847 seconds (Total)
## Chain 1: 
# 100 prior model lines
bikes %>%
  add_epred_draws(bike_priors, ndraws = 100) %>% 
  ggplot(aes(x = humidity, y = rides)) + 
  geom_line(aes(y = .epred, group = .draw), alpha = 0.15) 

# 4 prior simulated datasets
set.seed(6) 

bikes %>%
  add_predicted_draws(bike_priors, ndraws = 4) %>% 
  ggplot(aes(x = humidity, y = rides)) +
  geom_point(aes(y = .prediction, group = .draw)) +
  facet_wrap(~ .draw)

  1. The 100 prior model lines capture our prior understanding that ridership decreases with humidity. Ridership tends to be around 5000 on days of average humidity but could range from roughly 1000 to 9000. The variability in these lines adequately reflects our overall uncertainty about this association: ridership is weakly associated with humidity, and ridership exhibits great variability (a standard deviation of 2000 rides) at any given humidity. Ridership typically decreases by 10 rides with every one percentage point increase in humidity, though this average decrease could range from 0 to 20. The four ridership datasets, simulated from the Normal data model using four prior plausible sets of (β0, β1, σ) values, likewise show a rate of decrease in ridership with humidity, a baseline ridership, and a variability in ridership consistent with our prior assumptions.

Exercise 9.10

# Load and plot data
data(bikes) 
ggplot(bikes, aes(x = humidity, y = rides)) +
  geom_point(size = 0.5) +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

  1. Simple Normal regression seems to be a reasonable approach to modeling this relationship. Ridership and humidity can both be measured on a ratio scale. The plot shows that the relationship between the response variable ridership and the predictor humidity is linear, relatively weak, and has a negative slope: ridership decreases with humidity. For a given value of X, the Y values are independent, roughly normally distributed around the mean, and share the same standard deviation σ. So we can build a Bayesian simple Normal regression model of the quantitative response variable ridership by the quantitative predictor variable humidity that captures the linear dependence of ridership on humidity.

Exercise 9.11

# Perform a posterior simulation
bike_posterior <- update(bike_priors, prior_PD = FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.7e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.14892 seconds (Warm-up)
## Chain 1:                0.189265 seconds (Sampling)
## Chain 1:                0.338185 seconds (Total)
## Chain 1: 
mcmc_trace(bike_posterior, size = 0.1) 

mcmc_dens_overlay(bike_posterior)

mcmc_acf(bike_posterior)

neff_ratio(bike_posterior)
## (Intercept)    humidity       sigma 
##     1.02195     1.01190     0.93110
rhat(bike_posterior)
## (Intercept)    humidity       sigma 
##    1.000002    1.000077    1.000010

There’s no discernible trend or flat lines in the trace plots and the density plots are consistent across five chains. The autocorrelation quickly drops off and is effectively 0 by lag 5. The effective sample size ratios are around 1 and the R-hat values are very close to 1. All this indicates that the chains are stable, mixing quickly, and behaving much like an independent sample. So we can “trust” these simulation results.

# 100 posterior model lines
bikes %>%
  add_epred_draws(bike_posterior, ndraws = 100) %>% 
  ggplot(aes(x = humidity, y = rides)) + 
  geom_line(aes(y = .epred, group = .draw), alpha = 0.15)

Compared to the prior model lines, the posterior model lines indicate that the decrease in ridership with humidity is a bit less steep than we had anticipated. Further, the posterior plausible models are far less variable, indicating that we’re far more confident about the relationship between ridership and humidity upon observing the data.

Exercise 9.12

# Posterior summary statistics
tidy(bike_posterior, effects = c("fixed", "aux"), 
     conf.int = TRUE, conf.level = 0.95)
## # A tibble: 4 × 5
##   term        estimate std.error conf.low conf.high
##   <chr>          <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)  4020.      226.     3581.    4457.  
## 2 humidity       -8.45      3.39    -15.0     -1.75
## 3 sigma        1574.       50.2    1481.    1679.  
## 4 mean_PPD     3484.      100.     3288.    3683.
  1. The posterior median of the σ parameter, 1574, indicates that we can expect the observed ridership on a given day to deviate from the average ridership on days of the same humidity by roughly 1574 rides.

  2. The 95% posterior credible interval for the humidity coefficient β1, (-15.02, -1.75), indicates that there’s a 95% posterior probability that the slope lies somewhere between -15.02 and -1.75: for every one percentage point increase in humidity, typical ridership decreases by somewhere between roughly 2 and 15 rides.

  3. I think we have ample posterior evidence of a negative association between ridership and humidity. In our visual examination of 100 posterior plausible scenarios for the relationship between ridership and humidity in Exercise 9.11, all exhibited negative associations; a line exhibiting no relationship or a positive relationship (β1 = 0 or β1 > 0) would stand out. More rigorously, the 95% credible interval for β1 in the tidy summary above, (-15.02, -1.75), lies entirely and well below 0.

We can also do a quick tabulation to confirm that there’s almost certainly a negative association, P(β1 < 0 | y) ≈ 1: almost all of the 20,000 Markov chain values of β1 are negative.

# Tabulate the beta_1 values that fall below 0
bike_posterior_df <- as.data.frame(bike_posterior)
bike_posterior_df %>%
mutate(below_0 = humidity < 0) %>% 
  tabyl(below_0)
##  below_0     n percent
##    FALSE   126  0.0063
##     TRUE 19874  0.9937
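Equivalently, since the indicator is logical, the posterior probability can be approximated directly with mean(). A self-contained sketch, with simulated Normal draws standing in for the actual bike_posterior_df chain (parameters chosen to roughly match the tidy() summary above):

```r
set.seed(84735)

# Stand-in for the 20,000 chain values of beta_1 (the humidity coefficient),
# simulated here from a Normal roughly matching the posterior summary
humidity_draws <- rnorm(20000, mean = -8.45, sd = 3.39)

# P(beta_1 < 0 | y) is approximated by the proportion of negative draws
mean(humidity_draws < 0)  # roughly 0.99
```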

Exercise 9.13

# Predict rides for each parameter set in the chain
set.seed(84735) 

predict_90 <- bike_posterior_df %>%
  mutate(mu = `(Intercept)` + humidity*90, 
         y_new = rnorm(20000, mean = mu, sd = sigma))

head(predict_90, 3)
##   (Intercept)   humidity    sigma       mu    y_new
## 1    4149.515 -10.895459 1527.359 3168.924 4188.028
## 2    3917.499  -5.949563 1591.149 3382.038 3184.990
## 3    3977.836  -6.798818 1573.313 3365.943 4594.492

The collection of 20,000 mu values approximates the posterior model of the typical ridership on 90% humidity days, μ = β0 + β1 · 90. The 20,000 y_new values approximate the posterior predictive model of ridership for tomorrow, an individual 90% humidity day.

# Plot the posterior model of the typical ridership on 90% humidity days
ggplot(predict_90, aes(x = mu)) + 
  geom_density()

# Plot the posterior predictive model of tomorrow's ridership
ggplot(predict_90, aes(x = y_new)) + 
  geom_density()

Though they're centered at roughly the same value, the posterior model for μ is much narrower than the posterior predictive model for y_new. The posterior model for μ merely captures the uncertainty in the average ridership across all X = 90% humidity days. Since there is so little uncertainty about this average, its distribution visually appears like a wee spike. In contrast, the posterior predictive model for the number of rides tomorrow (a specific day) accounts for not only the uncertainty in the average ridership on a 90% humidity day, but also the individual variability from this average.

# Construct 80% posterior credible intervals

predict_90 %>%
  summarize(lower_new = quantile(y_new, 0.1), 
            upper_new = quantile(y_new, 0.9))
##   lower_new upper_new
## 1  1261.242  5292.553

The 80% posterior prediction interval for the number of rides tomorrow is (1261, 5293): there's an 80% posterior probability that the number of rides tomorrow is somewhere between 1261 and 5293.
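For contrast, we can construct the analogous 80% credible interval for the typical ridership μ from the mu column — a quick sketch assuming the predict_90 data frame simulated above is still in memory; per the density plots, it should be much narrower than the prediction interval (1261, 5293) for an individual day.

```r
# 80% posterior credible interval for the typical ridership mu
# on 90% humidity days (uncertainty about the average only)
predict_90 %>%
  summarize(lower_mu = quantile(mu, 0.1), 
            upper_mu = quantile(mu, 0.9))
```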

# Simulate a set of predictions
set.seed(84735) 

shortcut_prediction <- posterior_predict(bike_posterior, 
 newdata = data.frame(humidity = 90))

# Construct an 80% posterior credible interval
posterior_interval(shortcut_prediction, prob = 0.80) 
##        10%      90%
## 1 1261.242 5292.553
# Plot the approximate predictive model
mcmc_dens(shortcut_prediction) + 
  xlab("predicted ridership on a 90% humidity day")

Exercise 9.14

  1. Based on common sense and my experience, I have the following prior understanding of this relationship: On an average windspeed day, there are typically around 5000 riders, though this average could be somewhere between 1000 and 9000.

Ridership tends to decrease as windspeed increases. Specifically, for every one percentage point increase in windspeed level, the ridership tends to decrease by 6 rides, though this average decrease could be anywhere between 0 and 12.

Ridership is only weakly related to windspeed. At any given windspeed, ridership will tend to vary with a large standard deviation of 1000 rides.

  1. data: Yi | β0, β1, σ ∼ N(μi, σ^2) with μi = β0 + β1Xi

priors: β0c ∼ N(5000, 2000^2)

    β1 ∼ N(-6, 3^2)

    σ ∼ Exp(0.001) 
ridership_priors <- stan_glm(
  rides ~ windspeed, data = bikes, 
  family = gaussian, 
  prior_intercept = normal(5000, 2000), 
  prior = normal(-6, 3), 
  prior_aux = exponential(0.001),
  prior_PD = TRUE,
  chains = 5, iter = 4000*2, seed = 2666)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.8e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.340919 seconds (Warm-up)
## Chain 1:                0.061552 seconds (Sampling)
## Chain 1:                0.402471 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 6e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.211955 seconds (Warm-up)
## Chain 2:                0.045081 seconds (Sampling)
## Chain 2:                0.257036 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 3e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.201759 seconds (Warm-up)
## Chain 3:                0.044505 seconds (Sampling)
## Chain 3:                0.246264 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 5e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.293536 seconds (Warm-up)
## Chain 4:                0.048217 seconds (Sampling)
## Chain 4:                0.341753 seconds (Total)
## Chain 4: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 5).
## Chain 5: 
## Chain 5: Gradient evaluation took 2e-06 seconds
## Chain 5: 1000 transitions using 10 leapfrog steps per transition would take 0.02 seconds.
## Chain 5: Adjust your expectations accordingly!
## Chain 5: 
## Chain 5: 
## Chain 5: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 5: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 5: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 5: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 5: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 5: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 5: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 5: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 5: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 5: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 5: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 5: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 5: 
## Chain 5:  Elapsed Time: 0.228953 seconds (Warm-up)
## Chain 5:                0.045629 seconds (Sampling)
## Chain 5:                0.274582 seconds (Total)
## Chain 5:
# 100 prior model lines
bikes %>%
  add_epred_draws(ridership_priors, ndraws = 100) %>% 
  ggplot(aes(x = windspeed, y = rides)) + 
  geom_line(aes(y = .epred, group = .draw), alpha = 0.15) 

# 4 prior simulated datasets
set.seed(6) 

bikes %>%
  add_predicted_draws(ridership_priors, ndraws = 4) %>% 
  ggplot(aes(x = windspeed, y = rides)) +
  geom_point(aes(y = .prediction, group = .draw)) +
  facet_wrap(~ .draw)

The 100 prior model lines capture my prior understanding that ridership decreases with windspeed: ridership tends to be around 5000 on average windspeed days but could range from roughly 1000 to 9000. The variability in these lines adequately reflects my overall uncertainty about this association: ridership is only weakly associated with windspeed and exhibits great variability (a standard deviation of 1000 rides) at any given windspeed. Ridership typically decreases by 6 rides with every one percentage point increase in windspeed, though this average decrease could range from 0 to 12. The four ridership datasets simulated from the Normal data model, using four prior plausible sets of (β0, β1, σ) values, are also consistent with my prior assumptions about the rate of decrease in ridership with windspeed, the baseline ridership, and the variability in ridership.

# Load and plot data
data(bikes) 
ggplot(bikes, aes(x = windspeed, y = rides)) +
  geom_point(size = 0.5, alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

The observed patterns and my prior understanding both indicate that ridership decreases with windspeed. However, the observed data indicate that ridership tends to be around 3500 on average windspeed days and could range from roughly 0 to 7000. The variability in both the observed patterns and my prior understanding adequately reflects the overall uncertainty about this association: ridership is relatively weakly associated with windspeed, and exhibits similar variability at any given windspeed under both. While my prior understanding indicates that ridership typically decreases by 6 rides with every one percentage point increase in windspeed (with the average decrease ranging from 0 to 12), the observed patterns show a steeper rate of decrease in ridership with windspeed.
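As a rough cross-check of the observed rate of decrease, we can pull the intercept and slope from the least-squares line drawn by geom_smooth() above and compare the slope against my prior mean decrease of 6 rides — a sketch using the bikes data already loaded.

```r
# OLS fit for rides vs windspeed; the slope estimates the observed
# change in ridership per one-point increase in windspeed
lm(rides ~ windspeed, data = bikes) %>% 
  coef()
```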

Exercise 9.15

# Perform a posterior simulation
ridership_posterior <- update(ridership_priors, prior_PD = FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.139146 seconds (Warm-up)
## Chain 1:                0.18501 seconds (Sampling)
## Chain 1:                0.324156 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 1.3e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.3444 seconds (Warm-up)
## Chain 2:                0.191255 seconds (Sampling)
## Chain 2:                0.535655 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 4e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.136543 seconds (Warm-up)
## Chain 3:                0.185369 seconds (Sampling)
## Chain 3:                0.321912 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 4e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.162618 seconds (Warm-up)
## Chain 4:                0.188851 seconds (Sampling)
## Chain 4:                0.351469 seconds (Total)
## Chain 4: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 5).
## Chain 5: 
## Chain 5: Gradient evaluation took 5e-06 seconds
## Chain 5: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 5: Adjust your expectations accordingly!
## Chain 5: 
## Chain 5: 
## Chain 5: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 5: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 5: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 5: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 5: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 5: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 5: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 5: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 5: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 5: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 5: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 5: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 5: 
## Chain 5:  Elapsed Time: 0.142343 seconds (Warm-up)
## Chain 5:                0.188961 seconds (Sampling)
## Chain 5:                0.331304 seconds (Total)
## Chain 5:
# MCMC diagnostics
mcmc_trace(ridership_posterior, size = 0.1) 

mcmc_dens_overlay(ridership_posterior)

mcmc_acf(ridership_posterior)

neff_ratio(ridership_posterior)
## (Intercept)   windspeed       sigma 
##     0.97910     1.00390     0.98905
rhat(ridership_posterior)
## (Intercept)   windspeed       sigma 
##   0.9998757   0.9998945   1.0000329

There's no discernible trend or flat lines in the trace plots, and the density plots are consistent across the five chains. The autocorrelation quickly drops off and is effectively 0 by lag 5. The effective sample size ratios are around 1 and the R-hat values are very close to 1. All this indicates that the chains are stable, mixing quickly, and behaving much like an independent sample, so we can "trust" these simulation results.

# 100 posterior model lines
bikes %>%
  add_epred_draws(ridership_posterior, ndraws = 100) %>% 
  ggplot(aes(x = windspeed, y = rides)) + 
  geom_line(aes(y = .epred, group = .draw), alpha = 0.15)

Compared to the prior model lines, the posterior model lines indicate that the decrease in ridership with windspeed is steeper than we had anticipated. Further, the posterior plausible lines are far less variable, indicating that we're far more confident about the relationship between ridership and windspeed upon observing the data.

# Posterior summary statistics
tidy(ridership_posterior, effects = c("fixed", "aux"), 
     conf.int = TRUE, conf.level = 0.95)
## # A tibble: 4 × 5
##   term        estimate std.error conf.low conf.high
##   <chr>          <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)  3596.       80.6    3441.    3752.  
## 2 windspeed      -8.63      2.91    -14.3     -2.97
## 3 sigma        1567.       49.9    1473.    1669.  
## 4 mean_PPD     3485.      100.     3287.    3678.

The posterior median of the σ parameter, 1567, indicates that on a given day we can expect the observed ridership to deviate by roughly 1567 rides from the average ridership on days with the same windspeed. The 95% posterior credible interval for the windspeed coefficient β1, (-14.26, -2.97), indicates that there's a 95% posterior probability that this slope lies somewhere between -14.26 and -2.97.

Therefore, I think we have ample posterior evidence of a negative association between ridership and windspeed. In our visual examination of 100 posterior plausible scenarios for the relationship between ridership and windspeed above, all exhibited negative associations; a line exhibiting no relationship or a positive relationship (β1 = 0 or β1 > 0) would stand out. More rigorously, the 95% credible interval for β1 in the tidy summary above, (-14.26, -2.97), lies entirely and well below 0.

We can also do a quick tabulation to approximate that there's almost certainly a negative association: P(β1 < 0 | y) ≈ 1. Nearly all of the 20,000 Markov chain values of β1 are negative.

# Tabulate the beta_1 values that fall below 0
ridership_posterior_df <- as.data.frame(ridership_posterior)
ridership_posterior_df %>%
  mutate(below_0 = windspeed < 0) %>% 
  tabyl(below_0)
##  below_0     n percent
##    FALSE    44  0.0022
##     TRUE 19956  0.9978

Exercise 9.16

penguins_prior_default <- stan_glm(
  flipper_length_mm ~ bill_length_mm, data = penguins_bayes, 
  family = gaussian, 
  prior_intercept = normal(200, 25, autoscale = TRUE), 
  prior = normal(0, 2.5, autoscale = TRUE), 
  prior_aux = exponential(1, autoscale = TRUE),
  prior_PD = TRUE,
  chains = 4, iter = 5000*2, seed = 84735)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.1e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.102018 seconds (Warm-up)
## Chain 1:                0.051766 seconds (Sampling)
## Chain 1:                0.153784 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 2e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.02 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.087283 seconds (Warm-up)
## Chain 2:                0.053939 seconds (Sampling)
## Chain 2:                0.141222 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 2e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.02 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.087177 seconds (Warm-up)
## Chain 3:                0.054759 seconds (Sampling)
## Chain 3:                0.141936 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 3e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.092967 seconds (Warm-up)
## Chain 4:                0.054989 seconds (Sampling)
## Chain 4:                0.147956 seconds (Total)
## Chain 4:
prior_summary(penguins_prior_default)
## Priors for model 'penguins_prior_default' 
## ------
## Intercept (after predictors centered)
##   Specified prior:
##     ~ normal(location = 200, scale = 25)
##   Adjusted prior:
##     ~ normal(location = 200, scale = 352)
## 
## Coefficients
##   Specified prior:
##     ~ normal(location = 0, scale = 2.5)
##   Adjusted prior:
##     ~ normal(location = 0, scale = 6.4)
## 
## Auxiliary (sigma)
##   Specified prior:
##     ~ exponential(rate = 1)
##   Adjusted prior:
##     ~ exponential(rate = 0.071)
## ------
## See help('prior_summary.stanreg') for more details

data: Yi | β0, β1, σ ∼ N(μi, σ^2) with μi = β0 + β1Xi

priors: β0c ∼ N(200, 352^2)

    β1 ∼ N(0, 6.4^2)

    σ ∼ Exp(0.071) 
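The adjusted scales reported by prior_summary() follow rstanarm's autoscaling, which rescales the specified priors by the sample standard deviations of the response and predictor. A sketch of the arithmetic — approximate, since it assumes complete cases of penguins_bayes while stan_glm() drops only rows with missing flipper or bill length:

```r
# rstanarm autoscaling (roughly): intercept scale = specified scale * sd(y),
# coefficient scale = specified scale * sd(y) / sd(x),
# sigma rate = specified rate / sd(y)
penguins_complete <- na.omit(penguins_bayes)
sd_y <- sd(penguins_complete$flipper_length_mm)
sd_x <- sd(penguins_complete$bill_length_mm)
c(intercept_scale = 25 * sd_y,          # roughly 352
  slope_scale     = 2.5 * sd_y / sd_x,  # roughly 6.4
  sigma_rate      = 1 / sd_y)           # roughly 0.071
```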
    
# 100 prior model lines
penguins_1 <- na.omit(penguins_bayes) %>% 
  add_epred_draws(penguins_prior_default, ndraws = 100) %>% 
  ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) + 
  geom_line(aes(y = .epred, group = .draw), alpha = 0.15) 
penguins_1

# 4 prior simulated datasets
set.seed(6) 

penguins_2 <- na.omit(penguins_bayes) %>% 
  add_predicted_draws(penguins_prior_default, ndraws = 4) %>% 
  ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
  geom_point(aes(y = .prediction, group = .draw)) +
  facet_wrap(~ .draw)

penguins_2

  1. My weakly informative priors reflect much greater uncertainty about the relationship between flipper and bill length. Since the Normal prior for β1 is centered at 0, the model lines indicate that the association between flipper and bill length might be positive (β1 > 0), non-existent (β1 = 0), or negative (β1 < 0). Some of the simulated data points even include negative flipper length values. Further, the simulated datasets reflect a relatively strong relationship between flipper and bill length (with σ near 0). By utilizing weakly informative priors instead of totally vague priors, my prior uncertainty is still in the right ballpark: my priors focus on flipper lengths in the hundreds of millimeters (reasonable), not in the thousands or millions (unreasonable for a penguin's flipper).

Exercise 9.17

# Load and plot data
data(penguins_bayes) 
ggplot(penguins_bayes, aes(x = bill_length_mm, y = flipper_length_mm)) +
  geom_point(size = 0.5) +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (stat_smooth).
## Warning: Removed 2 rows containing missing values (geom_point).

The observed data indicate that flipper length increases with bill length. Flipper length tends to be around 200 mm at the average bill length of 44 mm but could range from roughly 170 to 230 mm. The variability in the observed patterns reflects some uncertainty about the association, but flipper length is strongly associated with bill length and exhibits small variability at any given bill length.

penguins_4 <- na.omit(penguins_bayes) %>% 
  summarise(mean_bill_length = mean(bill_length_mm), 
            mean_flipper_length = mean(flipper_length_mm),
            std_dev = sd(flipper_length_mm))

penguins_4
## # A tibble: 1 × 3
##   mean_bill_length mean_flipper_length std_dev
##              <dbl>               <dbl>   <dbl>
## 1             44.0                201.    14.0
  1. Simple Normal regression seems to be a reasonable approach to modeling this relationship. Flipper length and bill length can both be measured on a ratio scale. The plot shows that the relationship between the response variable flipper length and the predictor bill length is linear, strong, and positively sloped: flipper length increases with bill length. For a given value of X, the Y values are independent, roughly Normally distributed around the mean, and share the same standard deviation σ. So we can build a Bayesian simple Normal regression model of the quantitative response flipper length against the quantitative predictor bill length that captures the linear dependence of flipper length on bill length.

Exercise 9.18

# Perform a posterior simulation
penguins_posterior <- update(penguins_prior_default, prior_PD = FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 3.7e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.37 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.082209 seconds (Warm-up)
## Chain 1:                0.19302 seconds (Sampling)
## Chain 1:                0.275229 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 7e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.080224 seconds (Warm-up)
## Chain 2:                0.188762 seconds (Sampling)
## Chain 2:                0.268986 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 4e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.076329 seconds (Warm-up)
## Chain 3:                0.187483 seconds (Sampling)
## Chain 3:                0.263812 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 4e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.076953 seconds (Warm-up)
## Chain 4:                0.186265 seconds (Sampling)
## Chain 4:                0.263218 seconds (Total)
## Chain 4:
# 100 posterior model lines
penguins_3 <- na.omit(penguins_bayes) %>% 
  add_fitted_draws(penguins_posterior, n = 100) %>% 
  ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) + 
  geom_line(aes(y = .value, group = .draw), alpha = 0.15)
## Warning: `fitted_draws` and `add_fitted_draws` are deprecated as their names were confusing.
## Use [add_]epred_draws() to get the expectation of the posterior predictive.
## Use [add_]linpred_draws() to get the distribution of the linear predictor.
## For example, you used [add_]fitted_draws(..., scale = "response"), which
## means you most likely want [add_]epred_draws(...).
penguins_3

# Posterior summary statistics
tidy(penguins_posterior, effects = c("fixed", "aux"), 
     conf.int = TRUE, conf.level = 0.9)
## # A tibble: 4 × 5
##   term           estimate std.error conf.low conf.high
##   <chr>             <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)      127.       4.72    119.      134.  
## 2 bill_length_mm     1.69     0.107     1.52      1.86
## 3 sigma             10.6      0.407    10.0      11.3 
## 4 mean_PPD         201.       0.809   200.      202.
  1. The 90% posterior credible interval for the bill length coefficient β1, (1.52, 1.86), indicates that there’s a 90% posterior probability that the typical increase in flipper length per 1 mm increase in bill length is somewhere between 1.52 and 1.86 mm.

  2. We have ample posterior evidence of a positive association between flipper length and bill length. In our visual examination of 100 posterior plausible model lines in part b, all exhibited positive slopes; a line exhibiting no relationship or a negative relationship (β1 = 0 or β1 < 0) would stand out. More rigorously, the 90% credible interval for β1 in the tidy summary above, (1.52, 1.86), lies entirely and well above 0.

We can also do a quick tabulation to confirm that there’s almost certainly a positive association, P(β1 > 0 | y) ≈ 1: all 20,000 Markov chain values of β1 are positive.

# Tabulate the beta_1 values that are above 0
penguins_posterior_df <- as.data.frame(penguins_posterior)
penguins_posterior_df %>%
  mutate(above_0 = bill_length_mm > 0) %>% 
  tabyl(above_0)
##  above_0     n percent
##     TRUE 20000       1

Exercise 9.19

# Predict flipper length for each parameter set in the chain
set.seed(84735) 

predict_51 <- penguins_posterior_df %>%
  mutate(mu = `(Intercept)` + bill_length_mm*51, 
         y_new = rnorm(20000, mean = mu, sd = sigma))

head(predict_51, 3)
##   (Intercept) bill_length_mm     sigma       mu    y_new
## 1    130.3616       1.601364  9.994198 212.0312 218.6996
## 2    129.5932       1.620827 10.365243 212.2554 210.9718
## 3    131.9384       1.571693 10.514349 212.0947 220.3050

The collection of 20,000 mu values approximates the posterior model for the typical flipper length among penguins with 51mm bills, μ = β0 + β1 ∗ 51. The 20,000 y_new values approximate the posterior predictive model for the flipper length of Pablo, an individual penguin whose bill is 51mm long.

# Plot the posterior model of the typical flipper length among penguins with 51mm bills
ggplot(predict_51, aes(x = mu)) + 
  geom_density()

# Plot the posterior predictive model for Pablo’s flipper length
ggplot(predict_51, aes(x = y_new)) + 
  geom_density()

Though they’re centered at roughly the same value, the posterior model for μ is much narrower than the posterior predictive model for y_new. The posterior model for μ merely captures the uncertainty in the average flipper length among all penguins with X = 51mm bills; since there is so little uncertainty about this average, its density appears very narrow. In contrast, the posterior predictive model for Pablo’s flipper length (a specific penguin) accounts not only for the uncertainty in the average flipper length among penguins with 51mm bills, but also for individual variability around that average.
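The contrast in spread can also be checked numerically. A quick sketch, assuming the predict_51 data frame constructed above:

```r
# Compare the spread of the posterior for mu vs. the posterior predictive for y_new
predict_51 %>%
  summarize(sd_mu = sd(mu), sd_new = sd(y_new))
```

sd_new should exceed sd_mu, since y_new layers individual variability (roughly σ) on top of the uncertainty in μ.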

# Construct an 80% posterior prediction interval for y_new
predict_51 %>%
  summarize(lower_new = quantile(y_new, 0.1), 
            upper_new = quantile(y_new, 0.9))
##   lower_new upper_new
## 1  199.3855  226.5989

The 80% posterior prediction interval for Pablo’s flipper length is (199, 227): there’s an 80% posterior probability that Pablo’s flipper length is somewhere between 199 and 227 mm.

# Construct an 80% posterior credible interval for mu
predict_51 %>%
  summarize(lower_mu = quantile(mu, 0.1), 
            upper_mu = quantile(mu, 0.9))
##   lower_mu upper_mu
## 1 211.6763 214.0843

The 80% credible interval for the typical flipper length among all penguins with 51mm bills, (212, 214), is narrower than the 80% prediction interval for Pablo’s flipper length. The posterior model for μ captures only the uncertainty in the average flipper length among all penguins with X = 51mm bills, and there is so little uncertainty about this average that its posterior plausible values are concentrated in a small range. In contrast, the posterior predictive model for Pablo’s flipper length (a specific penguin) accounts not only for that average but also for individual variability around it, so it exhibits more variability and a wider range of posterior plausible values. Thus, at the same 80% credible level, the interval for the typical flipper length is narrower.

# Simulate a set of predictions
set.seed(84735) 

shortcut_prediction_1 <- posterior_predict(penguins_posterior, 
  newdata = data.frame(bill_length_mm = 51))

# Construct a 80% posterior credible interval
posterior_interval(shortcut_prediction_1, prob = 0.80) 
##        10%      90%
## 1 199.3855 226.5989
# Plot the approximate predictive model
mcmc_dens(shortcut_prediction_1) + 
  xlab("predicted flipper length among penguins with 51mm bills")
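There is an analogous shortcut for the μ interval: posterior_epred() (or posterior_linpred() with transform = TRUE in older rstanarm versions) simulates the posterior of the expected flipper length, which we can summarize the same way. A sketch, assuming the penguins_posterior model above:

```r
# Simulate the posterior of mu at bill_length_mm = 51
set.seed(84735) 
shortcut_mu <- posterior_epred(penguins_posterior, 
  newdata = data.frame(bill_length_mm = 51))

# Construct an 80% posterior credible interval for mu
posterior_interval(shortcut_mu, prob = 0.80)
```

This should roughly reproduce the (212, 214) interval computed by hand above.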

Exercise 9.20

  1. The researchers’ prior understanding of the relationship between flipper length and body mass is that flipper length increases with body mass. They believe flipper length typically increases by 0.01 mm with every one g increase in body mass, though this average increase could plausibly range from 0.006 to 0.014 mm per g.
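This understanding translates into the Normal(0.01, 0.002) prior used below by treating 0.006 and 0.014 as roughly 2 standard deviations below and above the prior mean:

```r
# Back out the Normal prior for beta_1 from the stated range
prior_mean <- (0.006 + 0.014) / 2  # midpoint of the plausible range
prior_sd   <- (0.014 - 0.006) / 4  # half-range spans roughly 2 sd
c(prior_mean, prior_sd)            # 0.010 0.002
```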

# Load and plot data
data(penguins_bayes) 
ggplot(penguins_bayes, aes(x = body_mass_g, y = flipper_length_mm)) +
  geom_point(size = 0.5) +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (stat_smooth).
## Warning: Removed 2 rows containing missing values (geom_point).

The observed data indicate that flipper length increases with body mass. At the average body mass of roughly 4207 g, flipper length tends to be around 200 mm, though across the data it ranges from roughly 170 to 230 mm. The scatter around the trend reflects some overall uncertainty about the association, yet flipper length is strongly associated with body mass and exhibits small variability at each given body mass.

penguins_5 <- na.omit(penguins_bayes) %>% 
  summarise(mean_body_mass_g = mean(body_mass_g), 
            mean_flipper_length = mean(flipper_length_mm),
            std_dev = sd(flipper_length_mm))

penguins_5
## # A tibble: 1 × 3
##   mean_body_mass_g mean_flipper_length std_dev
##              <dbl>               <dbl>   <dbl>
## 1            4207.                201.    14.0
  1. In a simple Normal regression model of flipper length Y by one predictor X, I think the σ parameter is bigger when X = bill length. Comparing the plot of flipper_length_mm versus body_mass_g above with the plot of flipper_length_mm versus bill_length_mm in exercise 9.17 part a, the observations deviate more from the model line when the predictor is bill length. Since σ measures the degree to which observations deviate from the model line, σ is bigger when X = bill length than when X = body mass.
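A quick non-Bayesian sanity check of this intuition (a sketch; it assumes penguins_bayes is loaded and simply compares residual standard deviations from ordinary least squares fits):

```r
penguins_cc <- na.omit(penguins_bayes)

# Residual sd with bill length vs. body mass as the predictor
sigma(lm(flipper_length_mm ~ bill_length_mm, data = penguins_cc))
sigma(lm(flipper_length_mm ~ body_mass_g, data = penguins_cc))
```

If the reasoning above holds, the first value should be the larger of the two.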

penguins_posterior_default <- stan_glm(
  flipper_length_mm ~ body_mass_g, data = penguins_bayes, 
  family = gaussian, 
  prior_intercept = normal(200, 15, autoscale = TRUE), 
  prior = normal(0.01, 0.002, autoscale = TRUE), 
  prior_aux = exponential(1, autoscale = TRUE),
  prior_PD = FALSE,
  chains = 4, iter = 5000*2, seed = 84735)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.083108 seconds (Warm-up)
## Chain 1:                0.192828 seconds (Sampling)
## Chain 1:                0.275936 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 7e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.098032 seconds (Warm-up)
## Chain 2:                0.186924 seconds (Sampling)
## Chain 2:                0.284956 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 6e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.083339 seconds (Warm-up)
## Chain 3:                0.19094 seconds (Sampling)
## Chain 3:                0.274279 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 5e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.096285 seconds (Warm-up)
## Chain 4:                0.189164 seconds (Sampling)
## Chain 4:                0.285449 seconds (Total)
## Chain 4:
# Store the 4 chains for each parameter in 1 data frame
penguins_posterior_default_df <- as.data.frame(penguins_posterior_default)

# Plot the posterior model of the β1 body_mass_g coefficient
ggplot(penguins_posterior_default_df, aes(x = body_mass_g)) + 
  geom_density()

The researchers’ posterior understanding of the relationship between flipper length and body mass is that flipper length increases with body mass. Per the posterior density above and the tidy summary below, flipper length typically increases by about 0.0100 mm with every one g increase in body mass, and this average increase could plausibly range from roughly 0.00995 to 0.0101 (95% credible interval). The posterior intercept is about 159 mm (95% credible interval roughly 158 to 160); since this corresponds to a body mass of 0 g, the typical flipper length at the average body mass of about 4207 g is closer to 159 + 0.0100 × 4207 ≈ 201 mm. On average, we can expect the observed flipper length of an individual penguin to deviate about 8.1 mm (σ) from the average flipper length of penguins with the same mass.

Compared to the researchers’ prior understanding, the posterior slope sits at the same typical increase of 0.01 mm per g, but with far less uncertainty: the plausible range has tightened from (0.006, 0.014) to roughly (0.00995, 0.0101). Overall, the researchers are far more confident about the relationship between flipper length and body mass upon observing the data.

tidy(penguins_posterior_default, effects = c("fixed", "aux"), 
     conf.int = TRUE, conf.level = 0.95)
## # A tibble: 4 × 5
##   term        estimate std.error  conf.low conf.high
##   <chr>          <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept) 159.     0.461     158.       160.    
## 2 body_mass_g   0.0100 0.0000350   0.00995    0.0101
## 3 sigma         8.10   0.312       7.53       8.75  
## 4 mean_PPD    201.     0.628     200.       202.
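Note that the intercept (~159) is the typical flipper length at body_mass_g = 0, not at the average body mass. The typical flipper length at the average mass (~4207 g) can be approximated directly from the chains; a sketch using the data frame above:

```r
# Posterior of the typical flipper length at the average body mass (~4207 g)
penguins_posterior_default_df %>%
  mutate(mu_avg = `(Intercept)` + body_mass_g * 4207) %>%
  summarize(estimate = mean(mu_avg), std_error = sd(mu_avg))
```

The estimate should land near the mean_PPD value of 201 in the tidy summary.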
penguins_prior_default_1 <- update(penguins_posterior_default, prior_PD = TRUE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.2e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.091696 seconds (Warm-up)
## Chain 1:                0.055498 seconds (Sampling)
## Chain 1:                0.147194 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 5e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.07753 seconds (Warm-up)
## Chain 2:                0.083702 seconds (Sampling)
## Chain 2:                0.161232 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 6e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.076581 seconds (Warm-up)
## Chain 3:                0.078213 seconds (Sampling)
## Chain 3:                0.154794 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 6e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.079678 seconds (Warm-up)
## Chain 4:                0.058486 seconds (Sampling)
## Chain 4:                0.138164 seconds (Total)
## Chain 4:
prior_summary(penguins_prior_default_1)
## Priors for model 'penguins_prior_default_1' 
## ------
## Intercept (after predictors centered)
##   Specified prior:
##     ~ normal(location = 200, scale = 15)
##   Adjusted prior:
##     ~ normal(location = 200, scale = 211)
## 
## Coefficients
##   Specified prior:
##     ~ normal(location = 0.01, scale = 0.002)
##   Adjusted prior:
##     ~ normal(location = 0.01, scale = 3.5e-05)
## 
## Auxiliary (sigma)
##   Specified prior:
##     ~ exponential(rate = 1)
##   Adjusted prior:
##     ~ exponential(rate = 0.071)
## ------
## See help('prior_summary.stanreg') for more details
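The adjusted scales reflect rstanarm’s autoscaling: with autoscale = TRUE, a slope prior’s scale is multiplied by sd(y)/sd(x), the intercept prior’s scale by sd(y), and the exponential rate is divided by sd(y). A rough check against the summary above, assuming penguins_bayes is loaded:

```r
# Reproduce the adjusted slope scale: specified scale * sd(y) / sd(x)
penguins_cc <- na.omit(penguins_bayes)
0.002 * sd(penguins_cc$flipper_length_mm) / sd(penguins_cc$body_mass_g)
```

This should come out near the 3.5e-05 reported for the adjusted coefficient prior, just as 15 × sd(y) ≈ 211 matches the adjusted intercept scale and 1/sd(y) ≈ 0.071 matches the adjusted exponential rate.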